
added: NonLinMPC and MovingHorizonEstimator integration with DI.jl #174


Merged
franckgaga merged 33 commits into main from diff_interface_final on Mar 20, 2025

Conversation

franckgaga
Member

@franckgaga franckgaga commented Mar 12, 2025

Nonlinear MPC and moving horizon estimator now use DifferentiationInterface.jl

Both the NonLinMPC and MovingHorizonEstimator constructors now accept gradient and jacobian keyword arguments, each receiving an AbstractADType to switch the backend of the objective gradient and of the constraint Jacobians, respectively. Note that the nonlinear inequality and equality constraints both use the backend provided by the jacobian argument. The possibility to provide a hessian will be added in a future PR (the L-BFGS approximation is used inside Ipopt for now, as was the case before).

By default, a dense AutoForwardDiff() is used everywhere, except for NonLinMPC with a non-SingleShooting transcription method, in which case this backend is used:

AutoSparse(
    AutoForwardDiff(); 
    sparsity_detector  = TracerSparsityDetector(), 
    coloring_algorithm = GreedyColoringAlgorithm()
)
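
For reference, switching the backends from user code looks roughly like this (a sketch: `estim` stands for any state estimator built beforehand and the tuning values are placeholders; only the `gradient`, `jacobian` and `transcription` keywords are the point here):

using ModelPredictiveControl
using ADTypes: AutoForwardDiff, AutoSparse
using SparseConnectivityTracer: TracerSparsityDetector
using SparseMatrixColorings: GreedyColoringAlgorithm

mpc = NonLinMPC(estim; Hp=20, Hc=2,
    transcription = MultipleShooting(),
    gradient = AutoForwardDiff(),        # backend for the objective gradient
    jacobian = AutoSparse(               # backend for the constraint Jacobians
        AutoForwardDiff();
        sparsity_detector  = TracerSparsityDetector(),
        coloring_algorithm = GreedyColoringAlgorithm(),
    ),
)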

For both objects, the differentiation makes use of the DifferentiationInterface.Constant() and DifferentiationInterface.Cache() functionalities. As a consequence, they no longer rely on PreallocationTools.DiffCache. However, I still need to keep this dependency, since linearize! and all DiffSolvers still rely on it for now.
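
Concretely, the pattern is roughly the following (a sketch only; `g!`, `g`, `J`, `z`, `p`, `buffer` and `jac_backend` are illustrative names, not the actual internals):

using DifferentiationInterface

# Preparation: the parameter vector enters as a Constant, work arrays as Caches:
prep = prepare_jacobian(g!, g, jac_backend, z, Constant(p), Cache(buffer))
# In-place evaluation at each solver iteration, reusing the same contexts:
jacobian!(g!, g, J, prep, jac_backend, z, Constant(p), Cache(buffer))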

Documentation now uses DocumenterInterLinks.jl

Some URLs in the docstrings were very long. They are now much shorter. It also eases the maintenance of the documentation links to external dependencies.

The doc preview is here: https://juliacontrol.github.io/ModelPredictiveControl.jl/previews/PR174
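
For context, the DocumenterInterLinks.jl setup boils down to something like this in docs/make.jl (a sketch: the inventory URL and the makedocs options are placeholders, not the actual configuration):

using Documenter, DocumenterInterLinks

links = InterLinks(
    "DifferentiationInterface" => "https://example.org/DifferentiationInterface.jl/stable/objects.inv",
)
makedocs(; sitename="ModelPredictiveControl.jl", plugins=[links])

# Docstrings can then use short `@extref` links instead of full URLs, e.g.
# [`DifferentiationInterface.jacobian`](@extref)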

@codecov-commenter

codecov-commenter commented Mar 18, 2025

Codecov Report

Attention: Patch coverage is 99.32432% with 1 line in your changes missing coverage. Please review.

Project coverage is 98.87%. Comparing base (c12d7bc) to head (990f9c5).

Files with missing lines     Patch %   Lines
src/controller/execute.jl    85.71%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #174      +/-   ##
==========================================
+ Coverage   98.85%   98.87%   +0.01%     
==========================================
  Files          25       25              
  Lines        4180     4160      -20     
==========================================
- Hits         4132     4113      -19     
+ Misses         48       47       -1     


@gdalle
Contributor

gdalle commented Mar 18, 2025

Heads up, DI v0.6.46 supports nested tuples and named tuples of arrays as caches

@franckgaga
Member Author

Thanks for the info, right now I do:

mycontext = (
    Constant(a),
    Cache(b),
    Cache(c),
    Cache(d),
)
grad_prep = prepare_gradient(myfunc, grad_backend, x, mycontext...; strict=Val(true))

and a similar splatting for the context in gradient!. Do you think this splatting is problematic for performance? I did some quick benchmarks and measured no impact at all.

@gdalle
Contributor

gdalle commented Mar 18, 2025

I don't think it is problematic, but for some users it is a bit painful: when you have tens of caches, you may want to structure them more neatly, so I give that option too.
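
For reference, that option looks roughly like this (a sketch reusing the placeholder names from the snippet above; it assumes DI ≥ v0.6.46):

mycontext = (
    Constant(a),
    Cache((; b, c, d)),   # a single Cache wrapping a named tuple of arrays
)
grad_prep = prepare_gradient(myfunc, grad_backend, x, mycontext...; strict=Val(true))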

@franckgaga
Member Author

Alright, thanks! I think it's neat enough in the code as it is.

@franckgaga
Member Author

@baggepinnen I will merge the PR, you can comment here if you spot anything not clear or sus

@franckgaga franckgaga merged commit bb68996 into main Mar 20, 2025
4 checks passed
@baggepinnen baggepinnen deleted the diff_interface_final branch March 21, 2025 06:24
@baggepinnen
Member

Did you end up getting things to work with Symbolics? If so, it would be nice to see if you get any benefits from the new common-subexpression elimination (CSE) optimization that was recently enabled. See JuliaDiff/DifferentiationInterface.jl#758 for a discussion; I got a 13x speedup on a sparse Hessian example for a cartpole system. My guess is that this optimization will give the most benefit for mechanical systems with a non-trivial mass matrix, e.g., 2 DOF and higher.

@gdalle
Contributor

gdalle commented Mar 28, 2025

Unfortunately not, but here's an MWE if you have a few cycles to dig into why Symbolics fails. I think it is a Symbolics issue, because jacobian is the only operator that fails with DI.Cache; all the others work as intended:

using DifferentiationInterface
using Symbolics: Symbolics

function f!(y, x)
    y[1:length(x)] .= sin.(x)
    y[(length(x) + 1):(2length(x))] .= cos.(x)
    return nothing
end

function f!_cache(y, x, c)
    f!(c, x)
    copyto!(y, c)
    return nothing
end
julia> x = [1.0, 2.0];

julia> y, c = zeros(4), zeros(4);

julia> jacobian(f!, y, AutoSymbolics(), x)  # correct
4×2 Matrix{Float64}:
  0.540302   0.0
  0.0       -0.416147
 -0.841471   0.0
  0.0       -0.909297

julia> jacobian(f!_cache, y, AutoSymbolics(), x, Cache(c))  # incorrect
4×2 Matrix{Float64}:
  0.0   0.0
  0.0   0.0
 -0.0   0.0
  0.0  -0.0

@gdalle
Contributor

gdalle commented Mar 28, 2025

Here's a pure Symbolics MWE: JuliaSymbolics/Symbolics.jl#1509

@baggepinnen
Member

The issue is that context_vars is overwritten when computing f!(y_var, x_var, context_vars...) so that

context_vars =
Vector{Symbolics.Num}[[sin(x₁), sin(x₂), cos(x₁), cos(x₂)]]

When the function is later built, it looks like the output is a function of those sin(x₁), sin(x₂), cos(x₁), cos(x₂) inputs, but when the cache is passed as an array of zeros, you get zeros out.

I don't think the Symbolics backend will benefit from a cache: it will trace through the function and build up an expression that is then used to compute the Jacobian symbolically. This tracing does not care about any intermediate allocations made while building the expression, since those will not be represented in the expression once it has been built.
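
For reference, a small illustration of that overwrite (a sketch reusing `f!` from the MWE above; the `Num`-filled cache stands in for the symbolic context variables that DI creates):

using Symbolics
@variables x[1:2]
xs = collect(x)                  # scalarize to a Vector{Num}
c  = zeros(Symbolics.Num, 4)     # symbolic "cache", initially all zeros
f!(c, xs)                        # tracing overwrites it with [sin(x[1]), sin(x[2]), cos(x[1]), cos(x[2])]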

@baggepinnen
Member

Another way of thinking about it is that Symbolics cannot represent computation, it can only represent expressions. The final expression for the output is the same no matter whether the cache is used as an intermediate or not.

@gdalle
Contributor

gdalle commented Mar 28, 2025

Oh ok, I understand better now. But then it is weird that no other operator reports errors when used with a cache.

I know that Symbolics doesn't care about caching, but if people define their function with a cache to use DI's functionality with other backends, then Symbolics also needs to behave the same way (even if internally it gets rid of the cache).

@baggepinnen
Member

If you call f!_cache once beforehand, so that c contains the correct intermediate values, you do get the correct Jacobian out:

julia> f!_cache(y, x, c)

julia> jacobian(f!_cache, y, AutoSymbolics(), x, Cache(c))
4×2 Matrix{Float64}:
  0.540302   0.0
  0.0       -0.416147
 -0.841471   0.0
  0.0       -0.909297

But there is no benefit in using such a cumbersome approach; the built Jacobian function should be allocation-free even if no cache was used during the symbolic tracing.

@gdalle
Contributor

gdalle commented Mar 28, 2025

So I guess the solution would be to build the function based solely on x and any constant contexts, but not cache contexts?

@baggepinnen
Member

So I guess the solution would be to build the function based solely on x and any constant contexts, but not cache contexts?

Yes, I think this should be fine. You might need to wrap the built function such that it takes the correct number of arguments, if that's expected downstream. Alternatively, you could recreate context_vars so that they are once again given by cache_1, cache_2, etc.; then Symbolics will not find them in the output, and you'd get a function computing everything in the output from x_vars instead of from cache_vars.

@gdalle
Contributor

gdalle commented Mar 28, 2025

Fix incoming in JuliaDiff/DifferentiationInterface.jl#760 but there is still something I don't understand.
Let's say I run a function f!(y, x, c) and at the end, c = sin(x[1]) and y = cos(c[1]) = cos(sin(x[1])). How does Symbolics decide that it will be defining y based on c instead of x?

@baggepinnen
Member

I am not sure, but my guess is that there is a tree search on the output expression taking place, and the first subexpression that matches any of the input arguments is used; when exploring the tree, sin(x[1]) appears closer to the root than x[1].

@franckgaga
Member Author

Fix incoming in JuliaDiff/DifferentiationInterface.jl#760 but there is still something I don't understand. Let's say I run a function f!(y, x, c) and at the end, c = sin(x[1]) and y = cos(c[1]) = cos(sin(x[1])). How does Symbolics decide that it will be defining y based on c instead of x?

Just tested AutoSymbolics() and it works perfectly now, good job @gdalle! 🥳 🥳 🥳

I will do some benchmarks to see if it's faster than AutoForwardDiff() in the sparse case.

@gdalle
Contributor

gdalle commented Mar 30, 2025

I'm rather curious!

@franckgaga
Member Author

The benchmarks on the inverted pendulum with MultipleShooting and Ipopt are:

# dense AutoForwardDiff and MultipleShooting:
btime_NMPC_track_solver_IP = median(bm) = TrialEstimate(1.253 s)

# dense AutoSymbolics and MultipleShooting:
btime_NMPC_track_solver_IP = median(bm) = TrialEstimate(1.077 s)

# sparse AutoForwardDiff and MultipleShooting:
btime_NMPC_track_solver_IP = median(bm) = TrialEstimate(1.269 s)

So even on this really small and simple ODE with a trivial mass matrix (though unstable), there is a gain with AutoSymbolics.

Now, I'm not able to run it with sparse AutoSymbolics:

backend = AutoSparse(
    AutoSymbolics(); 
    sparsity_detector  = TracerSparsityDetector(), 
    coloring_algorithm = GreedyColoringAlgorithm()
)

The error is:

ERROR: MethodError: no method matching sparsity_pattern(::DifferentiationInterfaceSymbolicsExt.SymbolicsTwoArgJacobianPrep{…})
The function `sparsity_pattern` exists, but no method is defined for this combination of argument types.

Closest candidates are:
  sparsity_pattern(::DifferentiationInterfaceSparseMatrixColoringsExt.SparseJacobianPrep)
   @ DifferentiationInterfaceSparseMatrixColoringsExt ~/.julia/packages/DifferentiationInterface/Av6Jb/ext/DifferentiationInterfaceSparseMatrixColoringsExt/DifferentiationInterfaceSparseMatrixColoringsExt.jl:20
  sparsity_pattern(::DifferentiationInterfaceSparseMatrixColoringsExt.SparseHessianPrep)
   @ DifferentiationInterfaceSparseMatrixColoringsExt ~/.julia/packages/DifferentiationInterface/Av6Jb/ext/DifferentiationInterfaceSparseMatrixColoringsExt/hessian.jl:21
  sparsity_pattern(::AbstractColoringResult)
   @ SparseMatrixColorings ~/.julia/packages/SparseMatrixColorings/BED6Y/src/result.jl:127

Stacktrace:
 [1] init_diffmat(T::Type, backend::AutoSparse{…}, prep::DifferentiationInterfaceSymbolicsExt.SymbolicsTwoArgJacobianPrep{…}, ::Int64, ::Int64)
   @ ModelPredictiveControl ~/.julia/dev/ModelPredictiveControl/src/general.jl:61
 [2] get_optim_functions(mpc::NonLinMPC{…}, ::Model)
   @ ModelPredictiveControl ~/.julia/dev/ModelPredictiveControl/src/controller/nonlinmpc.jl:659
 [3] init_optimization!(mpc::NonLinMPC{…}, model::NonLinModel{…}, optim::Model)
   @ ModelPredictiveControl ~/.julia/dev/ModelPredictiveControl/src/controller/nonlinmpc.jl:543
 [4] (NonLinMPC{})(estim::UnscentedKalmanFilter{…}, Hp::Int64, Hc::Int64, M_Hp::Matrix{…}, N_Hc::Matrix{…}, L_Hp::Matrix{…}, Cwt::Float64, Ewt::Float64, JE::ModelPredictiveControl.var"#159#163", gc!::ModelPredictiveControl.var"#160#164", nc::Int64, p::Vector{…}, transcription::MultipleShooting, optim::Model, gradient::AutoForwardDiff{…}, jacobian::AutoSparse{…})
   @ ModelPredictiveControl ~/.julia/dev/ModelPredictiveControl/src/controller/nonlinmpc.jl:126
 [5] NonLinMPC(estim::UnscentedKalmanFilter{…}; Hp::Int64, Hc::Int64, Mwt::Vector{…}, Nwt::Vector{…}, Lwt::Vector{…}, M_Hp::Matrix{…}, N_Hc::Matrix{…}, L_Hp::Matrix{…}, Cwt::Float64, Ewt::Float64, JE::ModelPredictiveControl.var"#159#163", gc!::ModelPredictiveControl.var"#160#164", gc::ModelPredictiveControl.var"#160#164", nc::Int64, p::Vector{…}, transcription::MultipleShooting, optim::Model, gradient::AutoForwardDiff{…}, jacobian::AutoSparse{…})
   @ ModelPredictiveControl ~/.julia/dev/ModelPredictiveControl/src/controller/nonlinmpc.jl:393
 [6] top-level scope
   @ ~/Dropbox/Programmation/Julia/TestMPC/src/nonlin_mpc_article.jl:52
Some type information was truncated. Use `show(err)` to see complete types.

Is this expected?

@gdalle
Contributor

gdalle commented Mar 30, 2025

No, this is not expected, but I'm not sure I can solve it either. Essentially, I overload analysis functions like SMC.sparsity_pattern or SMC.ncolors on generic sparse preparation result types. The thing is, Symbolics doesn't go through the usual sparse AD pipeline, so there won't be a coloring to analyze in the first place (and the sparsity pattern isn't computed explicitly either). If you want the sparsity pattern, you'll have to get it separately with ADTypes.jacobian_sparsity and Symbolics.SymbolicsSparsityDetector().
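
For reference, that separate detection looks roughly like this (a sketch based on the ADTypes detector API, reusing `f!`, `y` and `x` from the earlier MWE):

using ADTypes, Symbolics

pattern = ADTypes.jacobian_sparsity(f!, y, x, Symbolics.SymbolicsSparsityDetector())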

@gdalle
Contributor

gdalle commented Mar 30, 2025

Can you open an issue in the DI repo for me to make the error more informative?

@franckgaga
Member Author

@gdalle
Contributor

gdalle commented Mar 31, 2025

Could you benchmark against sparse FastDifferentiation too?

@franckgaga
Member Author

The benchmarks on the inverted pendulum with MultipleShooting and Ipopt are:

# dense AutoForwardDiff and MultipleShooting:
btime_NMPC_track_solver_IP = median(bm) = TrialEstimate(1.253 s)

# dense AutoSymbolics and MultipleShooting:
btime_NMPC_track_solver_IP = median(bm) = TrialEstimate(1.077 s)

# sparse AutoForwardDiff and MultipleShooting:
btime_NMPC_track_solver_IP = median(bm) = TrialEstimate(1.269 s)

So with the new bugfix related to sparse symbolic backends, I get:

# sparse AutoSymbolics and MultipleShooting:
btime_NMPC_track_solver_IP = median(bm) = TrialEstimate(1.045 s)

Thus, not much difference from the dense computations on this problem. Note that the bottleneck may be Ipopt here; if that's true, even faster differentiation algorithms would not change the result much. But I would need to do some profiling to verify this.

For FastDifferentiation, neither the dense nor the sparse backend works. The error in the dense case is:

ERROR: AssertionError: Should only be one path from root 5 to variable 1. Instead have 2 children from node 122 on the path
Stacktrace:
  [1] follow_path(a::FastDifferentiation.DerivativeGraph{Int64}, root_index::Int64, var_index::Int64)
    @ FastDifferentiation ~/.julia/packages/FastDifferentiation/kJ1qL/src/Factoring.jl:525
  [2] evaluate_path
    @ ~/.julia/packages/FastDifferentiation/kJ1qL/src/Factoring.jl:555 [inlined]
  [3] _symbolic_jacobian!(graph::FastDifferentiation.DerivativeGraph{…}, partial_variables::Vector{…})
    @ FastDifferentiation ~/.julia/packages/FastDifferentiation/kJ1qL/src/Jacobian.jl:47
  [4] _symbolic_jacobian
    @ ~/.julia/packages/FastDifferentiation/kJ1qL/src/Jacobian.jl:61 [inlined]
  [5] jacobian
    @ ~/.julia/packages/FastDifferentiation/kJ1qL/src/Jacobian.jl:90 [inlined]
  [6] prepare_jacobian_nokwarg(::Val{…}, ::Function, ::Vector{…}, ::AutoFastDifferentiation, ::Vector{…}, ::Cache{…}, ::Cache{…}, ::Cache{…}, ::Cache{…}, ::Cache{…}, ::Cache{…}, ::Cache{…}, ::Cache{…}, ::Cache{…}, ::Cache{…}, ::Cache{…})
    @ DifferentiationInterfaceFastDifferentiationExt ~/.julia/packages/DifferentiationInterface/7eD1K/ext/DifferentiationInterfaceFastDifferentiationExt/twoarg.jl:320
  [7] #prepare_jacobian#48
    @ ~/.julia/packages/DifferentiationInterface/7eD1K/src/first_order/jacobian.jl:23 [inlined]
  [8] prepare_jacobian
    @ ~/.julia/packages/DifferentiationInterface/7eD1K/src/first_order/jacobian.jl:15 [inlined]
  [9] get_optim_functions(mpc::NonLinMPC{…}, ::JuMP.Model)
    @ ModelPredictiveControl ~/.julia/dev/ModelPredictiveControl/src/controller/nonlinmpc.jl:706
 [10] init_optimization!(mpc::NonLinMPC{…}, model::NonLinModel{…}, optim::JuMP.Model)
    @ ModelPredictiveControl ~/.julia/dev/ModelPredictiveControl/src/controller/nonlinmpc.jl:543
 [11] (NonLinMPC{})(estim::UnscentedKalmanFilter{…}, Hp::Int64, Hc::Int64, M_Hp::Matrix{…}, N_Hc::Matrix{…}, L_Hp::Matrix{…}, Cwt::Float64, Ewt::Float64, JE::ModelPredictiveControl.var"#159#163", gc!::ModelPredictiveControl.var"#160#164", nc::Int64, p::Vector{…}, transcription::MultipleShooting, optim::JuMP.Model, gradient::AutoForwardDiff{…}, jacobian::AutoFastDifferentiation)
    @ ModelPredictiveControl ~/.julia/dev/ModelPredictiveControl/src/controller/nonlinmpc.jl:126
 [12] NonLinMPC(estim::UnscentedKalmanFilter{…}; Hp::Int64, Hc::Int64, Mwt::Vector{…}, Nwt::Vector{…}, Lwt::Vector{…}, M_Hp::Matrix{…}, N_Hc::Matrix{…}, L_Hp::Matrix{…}, Cwt::Float64, Ewt::Float64, JE::ModelPredictiveControl.var"#159#163", gc!::ModelPredictiveControl.var"#160#164", gc::ModelPredictiveControl.var"#160#164", nc::Int64, p::Vector{…}, transcription::MultipleShooting, optim::JuMP.Model, gradient::AutoForwardDiff{…}, jacobian::AutoFastDifferentiation)
    @ ModelPredictiveControl ~/.julia/dev/ModelPredictiveControl/src/controller/nonlinmpc.jl:393
 [13] top-level scope
    @ ~/Dropbox/Programmation/Julia/TestMPC/src/nonlin_mpc_article.jl:52
Some type information was truncated. Use `show(err)` to see complete types.

The error is different in the sparse case. I can also post it here if you are interested @gdalle, but I think it's not relevant for now, since it logically needs to work in the dense case before it can work in the sparse case.

@gdalle
Contributor

gdalle commented Mar 31, 2025

Could you post a full MWE in a DI issue? I think this is related to cache handling in FastDifferentiation, probably a similar error to the one you diagnosed earlier with Symbolics.

@franckgaga
Member Author

I will try to create an MWE.

@franckgaga
Member Author

franckgaga commented Mar 31, 2025

I don't think it's related to cache handling, since this MWE with a cache works as expected:

using DifferentiationInterface
import FastDifferentiation
x = float.(1:8)
y = Vector{eltype(x)}(undef, length(x)-1)
c = Cache(y)
function f!(y, x, c)
    for i in eachindex(y)
        c[i] = (x[i+1]^2-x[i]^2)
    end
    for i in eachindex(y)
        j = length(x) - i
        c[i] += (x[j]^2-x[j+1]^2)
    end
    y .= c
    return y
end
backend = AutoFastDifferentiation()
prep = prepare_jacobian(f!, y, backend, x, c; strict=Val(true))
J = jacobian(f!, y, prep, backend, x, c)

giving:

7×8 Matrix{Float64}:
 -2.0   4.0   0.0   0.0    0.0    0.0   14.0  -16.0
  0.0  -4.0   6.0   0.0    0.0   12.0  -14.0    0.0
  0.0   0.0  -6.0   8.0   10.0  -12.0    0.0    0.0
  0.0   0.0   0.0   0.0    0.0    0.0    0.0    0.0
  0.0   0.0   6.0  -8.0  -10.0   12.0    0.0    0.0
  0.0   4.0  -6.0   0.0    0.0  -12.0   14.0    0.0
  2.0  -4.0   0.0   0.0    0.0    0.0  -14.0   16.0

I think we are hitting some language constructs not supported by FastDifferentiation. I have some for loops and if conditionals in the nonlinear objective and nonlinear constraint functions. I can still open an issue on DI.jl with a not-so-minimal reproducible example that uses MPC.jl, if you are interested.
